Summary


Problem 


Many  laboratory  investigations  of  the  effects  of  stressors  on  human  performance  employ 
performance  assessment  batteries  consisting  of  sequential  tests  of  isolated  cognitive  attributes  and  abilities. 
However,  operational  jobs  usually  require  concurrent  performance  of  two  or  more  tasks,  each  with  its  own 
set  of  requirements  and  consequences.  Thus,  conclusions  regarding  the  operational  impact  of  stressors 
based  on  traditional  tests  may  be  questioned. 

Objective 

The objective of this study was to compare performance on a traditional sequential test battery
with that on a synthetic work task (SYNWORK1) requiring subjects to work concurrently on several
tasks.


Approach 

Subjects were tested every three hours during 64 hr of sleep deprivation on a battery of tests
including both traditional sequential attribute tests and SYNWORK1.

Results 


Performance decrements began to appear during the first night of lost sleep, becoming more severe
during the second night. Decrements were greatest in the early morning hours, recovering somewhat
during the day. Performance decrements were usually more severe on the sequential tests than on
comparable measures from SYNWORK1, and sequential tasks showed greater sensitivity to sleep
deprivation (day effect on ANOVA). However, SYNWORK1 measures showed a stronger circadian
rhythm (time of day effect on ANOVA) than comparable measures from the sequential tests, and the
composite SYNWORK1 score showed a stronger circadian rhythm than any individual task from the
sequential tests. For the sequential tests, but not SYNWORK1, performance efficiency (throughput)
tended to be more sensitive to sleep deprivation than either speed or accuracy. Subjects reported that
SYNWORK1 was more interesting than the sequential tests.

Conclusions 


These  results  confirm  long-standing  reports  in  the  literature  that  complex  and  interesting  tasks  are 
less  sensitive  to  disruption  by  sleep  deprivation  than  boring  and  uninteresting  tasks,  thus  highlighting  the 
importance  of  motivation  in  determining  the  outcome  of  laboratory  performance  tests.  The  synthetic  work 
approach represents an attempt to bring some of the structural and functional complexities of operational
environments into the laboratory, permitting rigorous investigation of the effects of stressors on
performance under more realistic conditions than those provided by traditional sequential tests of
attributes and abilities. Decisions regarding what type of test is appropriate for a given application
require consideration of trade-offs between generality and validity of results and cost inherent in different
approaches to performance assessment.

Introduction


Techniques for assessment of human cognitive performance span a broad range, from
questionnaires and paper-and-pencil tests, through computerized tests of cognitive and motor abilities, to
simulators and field exercises. Due to the availability of powerful personal computers, recent years have
seen a concentration of interest on the middle of this continuum, computerized testing in the form of
performance assessment batteries or PABs (Anger, 1990; England, Reeves, Shingledecker, Thorne, Wilson,
& Hegge, 1985; Kennedy et al., 1981; Thorne et al., 1985). These batteries consist of tests that are each
designed to measure a limited subset of abilities, in relative isolation from one another (Anger, 1990;
Perez, Ramsey, Masline, & Urban, 1987). While performance on PABs is sensitive to a wide range of
variables, the
relationship  between  such  effects  and  performance  under  operational  conditions  is  sometimes  difficult  to 
characterize.  At  best,  it  can  be  said  that  if  a  given  variable  produces  a  decrement  on  PAB  performance,  it 
may  degrade  performance  under  certain  operational  conditions. 

Work situations outside the laboratory, particularly in operational military settings, usually involve
performance of more than one task at a time, with priorities assigned by instructions and contingencies
associated with performance of the component tasks. Extensive literature demonstrates that performance
differs under dual-task conditions, compared to single-task performance (e.g., Chiles, 1982; Damos, 1989).
Thus, determination of effects on dual- or multiple-task performance is a highly desirable step in the
evaluation of the performance effects of any stressor that might adversely affect military performance.

This study compared the effects of 64 h of sleep deprivation on performance of a synthetic work task
(Synwork1) requiring subjects to perform four tasks concurrently (cf. Alluisi, 1967), with
performance on several standard PAB measures.


Method 


Subjects 

The  nine  male  subjects,  enlisted  personnel  in  either  the  Navy  or  the  Marine  Corps,  were  recruited 
from  a  variety  of  sources.  Most  had  completed  high  school,  but  none  had  completed  four  years  of  college. 
Depending on availability, one to three subjects were run in any given week.

Apparatus 

Four IBM/AT-compatible computers (Unisys 386, 20 MHz, color VGA monitor, 80 MB hard disk,
Microsoft mouse) were used in this study. The computers were located in a single room, with testing
stations separated by low partitions.

Sleep  deprivation  study 

On  the  day  subjects  were  brought  into  the  laboratory,  two  training  sessions  were  conducted.  In 
these  sessions,  detailed  instructions  concerning  each  test  were  provided,  as  well  as  trial-by-trial  feedback  on 
performance  during  the  tests.  The  duration  of  training  sessions  was  the  same  as  described  below  for  the 
testing  sessions.  When  a  subject  had  difficulties  with  a  particular  test,  extra  practice  on  that  test  was 
permitted.  Following  the  training  day,  no  instructions  or  trial-by-trial  feedback  were  provided  for  any  of 
the tests, except as described below for Synwork1. Subjects slept in the laboratory, and were awakened at
0600. A Performance Assessment Battery consisting of eleven tests, including Synwork1, was
administered at 0900, and every three hours thereafter for the duration of the study.® Approximately 110
minutes were required for completion of the battery. We will present data from Synwork1 and nine of the
PAB tests.

PAB  tests 

Simple  reaction  time.  In  this  test,  a  small  rectangle  was  displayed  in  the  center  of  the  personal 
computer  (PC)  screen.  Subjects  were  instructed  to  press  either  button  of  the  mouse  and  hold  it  down  when 
the square turned light blue, and to release the button as rapidly as possible when the square turned green.
In reaction time terminology, the blue light was the “foreperiod,” and the green light the “trigger.”
Intertrial intervals (ITIs) were 2 s. The box remained blue for a random time ranging from 0.6 s to 1.2 s.
Premature  releases  or  presses  initiated  a  new  ITI.  Sessions  terminated  after  5  min  or  50  trials. 

Go-no-go  reaction  time.  This  test  was  similar  to  the  simple  reaction  time  test  previously  described, 
except  that  on  50%  of  the  trials,  the  square  turned  red  rather  than  green,  remaining  red  for  1  s  at  which 
time  it  turned  blue  again.  On  these  trials,  the  subject  was  instructed  to  hold  the  mouse  button  until  the 
square  returned  to  blue.  Sessions  terminated  after  5  min  or  50  trials. 
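
The trial structure of the two reaction time tests can be sketched in a few lines of Python. This is an illustration only and not the software used in the study: the function name, the uniform distribution of the foreperiod, and the independent random draw that decides whether a trial is a no-go trial are all assumptions.

```python
import random

def make_trials(n_trials=50, p_nogo=0.0, seed=None):
    """Return a list of (foreperiod_s, kind) tuples for one session.

    Assumptions (not stated in the text): the foreperiod is drawn
    uniformly from 0.6-1.2 s, and each trial is independently a
    no-go trial with probability p_nogo.
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        foreperiod = rng.uniform(0.6, 1.2)   # time the square stays blue
        kind = "nogo" if rng.random() < p_nogo else "go"
        trials.append((foreperiod, kind))
    return trials

simple = make_trials(p_nogo=0.0, seed=1)     # simple reaction time
go_nogo = make_trials(p_nogo=0.5, seed=1)    # go-no-go reaction time
```

A session built this way stops at 50 trials; the alternative 5-min time limit described above would have to be enforced by the program that runs the trials.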


® Test order was as follows: Matrix test (spatial memory); Two-column Addition; Logical Reasoning (3-letter);
Simple Reaction Time; Complex Reaction Time; 10-min Break; Four-choice Reaction Time; Digit-symbol
substitution; Logical Reasoning (2,3,4-letter); Word Memory; Synwork1; Tapping (sleepiness test).

Four-choice reaction time. This test adhered to the standards described in Perez et al. (1987). A
5.0-inch square divided into four quadrants was displayed in the center of the PC screen. On each trial, a
0.75-inch plus (+) sign was displayed in the center of a randomly selected quadrant of the square, and
subjects were instructed to press the 7, 9, 1, or 3 key of the numeric keypad of the keyboard with the index
finger of the preferred hand. The correct key depended on the quadrant in which the plus sign was
displayed (e.g., the 7 key for upper left quadrant, 9 for upper right). The plus sign disappeared as soon as
the correct key was pressed, reappearing after a 1-s ITI. Test duration was 11 min.

Two-column  addition.  Five  randomly  selected  two-digit  numbers  were  displayed  one  above  the 
other  in  a  column  in  the  center  of  the  screen  with  a  line  drawn  beneath.  Characters  were  double  the  normal 
height  and  width  of  PC  characters.  The  subject's  task  was  to  add  the  numbers  and  enter  the  answer  using 
the  number  keys  on  the  PC  keyboard.  When  the  first  digit  of  the  answer  was  entered,  the  problem 
disappeared.  As  soon  as  the  “Enter”  key  was  pressed,  the  screen  was  cleared  and  a  new  problem  appeared. 
Test  duration  was  10  min. 

Word  memory.  In  this  task,  a  list  of  10  words  was  presented  for  10  s.  After  a  brief  delay,  20 
probe  words  were  presented,  including  the  10  words  from  the  list,  and  10  nonlist  words.  Presentation  order 
was  random.  For  each  word,  the  subject  was  required  to  press  “T”  or  “U”  on  the  keyboard.  Test  duration 
was  10  min  or  until  10  lists  were  completed. 

Three-letter  logic.  The  subject  was  presented  with  a  three-letter  sequence  containing  a  random 
permutation  of  the  letters  A,  B,  and  C,  and  two  statements  about  the  sequence.  For  example,  one  trial 
might  consist  of  the  following  screen: 

A  FOLLOWS  B 
A  DOES  NOT  FOLLOW  C 
AC  B 


The subject’s task was to press “T” if both statements were true, and “U” if one or both statements were
untrue.  A  new  problem  appeared  as  soon  as  a  key  was  pressed.  Test  duration  was  20  min. 
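
The scoring rule for this test can be made concrete with a short sketch. The code below is hypothetical: the function names and the statement encoding are invented for illustration, and "X FOLLOWS Y" is assumed to mean that X appears anywhere later in the sequence than Y (the text does not say whether the letters must be adjacent).

```python
def follows(seq, x, y):
    """True if letter x appears later in seq than letter y (assumed meaning)."""
    return seq.index(x) > seq.index(y)

def correct_key(seq, statements):
    """Return 'T' if every statement holds for seq, otherwise 'U'.

    Each statement is a tuple (x, negated, y) meaning
    'x FOLLOWS y' or, when negated is True, 'x DOES NOT FOLLOW y'.
    """
    for x, negated, y in statements:
        holds = follows(seq, x, y)
        if negated:
            holds = not holds
        if not holds:
            return "U"
    return "T"

# The example screen from the text: sequence A C B.
answer = correct_key("ACB", [("A", False, "B"), ("A", True, "C")])
```

For the example screen the first statement is false, so the correct key is "U". The same answer results if FOLLOWS is read as "immediately follows."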
Two-, three-, four-letter logic. In this task, one third of the trials were identical to the three-letter
logic task. On one third of the trials only two letters and one statement were presented, and on the
remaining third of the trials, four letters and three statements were presented. Test duration was 5 min or
30 problems.

Matrix.  On  each  trial,  the  subject  was  presented  with  a  screen  containing  14  asterisks  arranged  in 
a  14  by  14  matrix  such  that  each  asterisk  was  in  a  unique  row  and  column.  The  display  remained  on  the 
screen  for  1.5  s,  then  the  screen  went  blank.  After  3.5  s,  the  matrix  was  re-displayed,  either  the  same  as 
before,  or  with  four  of  the  asterisks  moved  to  new  locations.  The  subject  was  instructed  to  press  “S”  if  the 
matrix  was  the  same,  or  “D”  if  it  was  different.  Test  duration  was  10  min. 
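
A display of this kind is equivalent to a random permutation, which the following sketch makes explicit. It is not the original test software. The function names are hypothetical, and the way the four asterisks are moved on "different" trials (the chosen asterisks exchange columns, so rows and columns stay unique) is an assumption, since the text does not specify it.

```python
import random

SIZE = 14

def make_matrix(rng):
    """Place one asterisk per row; columns form a random permutation,
    so every asterisk is in a unique row and column."""
    cols = list(range(SIZE))
    rng.shuffle(cols)
    return cols            # cols[r] is the column of the asterisk in row r

def make_different(cols, rng, n_moved=4):
    """Return a copy with n_moved asterisks in new columns.

    Assumption: the moved asterisks exchange columns among themselves,
    which keeps rows and columns unique.
    """
    rows = rng.sample(range(SIZE), n_moved)
    new = list(cols)
    # Rotate the columns of the chosen rows by one position.
    for i, r in enumerate(rows):
        new[r] = cols[rows[(i + 1) % n_moved]]
    return new

rng = random.Random(0)
original = make_matrix(rng)
changed = make_different(original, rng)
```

Because the columns in the original display are all distinct, rotating the columns of four rows always changes exactly four asterisk positions.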

Digit-Symbol Substitution. In each session, the digits 0 to 9 were displayed in a row at the top of
the screen, and a row of symbols was displayed, one below each digit. The position of the symbols was
randomized each session. In each trial, a random digit was displayed and the subject's task was to press
the symbol key associated with that digit. A new digit appeared as soon as a key was pressed. Test
duration was 5 min.

Tapping.  In  this  task,  subjects  were  instructed  to  press  a  mouse  button  once  per  second  for  the 
duration  of  the  task.  To  serve  as  a  pacing  stimulus,  for  the  first  10  s,  a  .75-inch  (1.9  cm)  diameter  red  spot 
flashed  in  the  center  of  the  screen  for  the  first  0.2  s  of  each  1-s  period.  After  10  s  had  passed,  the  screen 
remained  a  uniform  dark  gray  for  the  remainder  of  the  session.  Test  duration  was  5  min. 

Synthetic work test: SYNWORK1

The computer program, SYNWORK1 (referred to in the remainder of the paper as Synwork), has
been described in detail elsewhere (Elsmore, 1991, 1994; Elsmore, Naitoh, & Linnville, 1992), and will be
briefly described here. The program presents four simultaneous tasks on the screen of an IBM-compatible
PC.  The  tasks  comprising  Synwork  were  selected  to  provide  a  generic  work  environment  in  which  the 
operator  is  required  to  remember  and  classify  items  on  demand,  perform  a  self-paced  task  (arithmetic 
problems),  and  monitor  and  react  to  both  visual  and  auditory  stimuli.  No  attempt  was  made  to  simulate 
any  particular  job  or  system,  although  several  users  have  suggested  that  the  program  provides  a  reasonable 
part-simulation of various watchstanding jobs.

Subjects interact with Synwork using a standard mouse, permitting the subject to concentrate on
the  information  on  the  screen,  eliminating  the  distractions  of  a  keyboard.  Synwork  allows  researchers 
broad  latitude  in  the  choice  of  task  parameters  (e.g.,  subtask  difficulty,  payoff  matrix).  The  four  subtasks 
are  each  displayed  in  one  quadrant  (“window”)  of  the  screen.  The  program  uses  a  Sound  Blaster  sound 
card  for  presentation  of  all  auditory  stimuli.  Subjects  wore  headphones  during  the  Synwork  portion  of  the 
test session.’ Correct responses produced a high-pitched “squeaking” sound, and errors produced a
low-pitched “burping” sound. The following paragraphs describe the details of each subtask in the present
experiment: 

Upper  left  window.  Sternberg  memory  task.  Each  session,  a  list  of  letters  (the  positive  list)  was 
randomly  chosen  from  the  alphabet  with  the  exceptions  of  the  letters  C,  D,  M,  Q,  and  V  which  were  not 
used.  Letters  were  displayed  in  uppercase  in  a  box  at  the  top  of  the  window.  In  the  present  study,  the  list 
was  6  letters  long.  At  the  same  time,  a  "negative  list"  of  the  same  length  was  selected  from  the  remaining 
letters  of  the  alphabet.  At  the  beginning  of  the  session,  the  positive  list  was  displayed  for  only  5  s,  after 
which  it  was  replaced  by  the  words  "RETRIEVE  LIST".  When  this  message  was  displayed,  clicking  the 
mouse on the list box resulted in display of the list for another 5 s. The list could be retrieved as many
times as required during the session, although a 10-point penalty was charged each time the list was retrieved.
At the beginning of each trial, a sample letter was displayed in a box in the center of the window. The
letter was randomly selected from either the positive or negative list (p = .5). The subject's task was to
indicate,  by  clicking  the  mouse  on  either  the  “YES”  or  the  “NO”  box  at  the  bottom  of  the  window,  whether 
the  letter  was  a  member  of  the  positive  set  or  not.  Ten  points  were  awarded  for  each  correct  choice,  and  a 
10-point  penalty  was  imposed  for  each  error.  Following  each  choice  the  sample  letter  disappeared  for  the 
remainder  of  the  trial.  Trials  were  20  s  in  duration. 
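
The list construction and point rules of the memory subtask can be summarized in code. The sketch below is an illustration under stated assumptions and not the SYNWORK1 source: all names are hypothetical, and the probe is drawn uniformly from the union of the two lists, which gives each list a probability of .5 because the lists are the same length.

```python
import random
import string

EXCLUDED = set("CDMQV")          # letters not used, per the text
LETTERS = [c for c in string.ascii_uppercase if c not in EXCLUDED]

def make_lists(rng, length=6):
    """Draw non-overlapping positive and negative lists of equal length."""
    chosen = rng.sample(LETTERS, 2 * length)
    return set(chosen[:length]), set(chosen[length:])

def score_trial(letter, clicked_yes, positive):
    """+10 for a correct YES/NO choice, -10 for an error."""
    return 10 if clicked_yes == (letter in positive) else -10

RETRIEVAL_PENALTY = -10          # charged each time the list is re-displayed

rng = random.Random(0)
positive, negative = make_lists(rng)
probe = rng.choice(sorted(positive | negative))
points = score_trial(probe, clicked_yes=(probe in positive), positive=positive)
```

With five letters excluded, 21 letters remain, so two lists of six letters can always be drawn without overlap.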

Upper  right  window,  arithmetic  task.  A  three-column  addition  task  presented  two  randomly 
selected  three-digit  numbers  between  100  and  999,  with  “0000”  in  the  answer  space.  The  subject’s  task 
was  to  adjust  the  answer  by  clicking  on  “+”  and  “-”  boxes  below  each  character  of  the  answer.  Clicking  on 
a  box  labeled  “DONE”  at  the  bottom  of  the  window  resulted  in  the  presentation  of  a  new  problem,  award 
of  10  points  for  correct  answers,  and  subtraction  of  10  points  for  errors.  There  were  no  time  limits  for 
completion  of  this  task  or  penalties  for  noncompletion,  thus  it  was  completely  self-paced. 


’’ The Synwork1 program and a detailed reference manual are available on request from the senior author.

Lower left window, visual monitoring. A pointer moved from the center of a graduated scale, 201
pixels  in  length,  towards  either  end  at  a  fixed  rate  of  200  msec  per  pixel.  Clicking  the  mouse  on  a  box 
labeled  “RESET”  at  the  top  of  the  window  reset  the  pointer  to  the  center.  The  subject's  task  was  to 
prevent  the  pointer  from  reaching  the  end  of  the  scale.  Points  were  awarded  for  each  reset,  with  the  number 
of  points  being  proportional  to  the  distance  of  the  pointer  from  the  center  at  the  time  of  reset  up  to  a 
maximum  of  10  points  for  resets  near  the  ends  of  the  scale.  Ten  points  were  deducted  for  each  second  the 
pointer  was  at  either  end  of  the  scale. 
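
The timing and payoff of the visual monitoring subtask follow from the numbers above. In the sketch below, the function names are hypothetical and the exact mapping from distance to points (linear, rounded down, capped at 10) is an assumption; the text states only that points were proportional to distance, up to a maximum of 10.

```python
SCALE_PIXELS = 201                       # length of the scale
HALF_RANGE = (SCALE_PIXELS - 1) // 2     # 100 pixels from center to either end
MS_PER_PIXEL = 200                       # pointer speed

def seconds_to_end():
    """Time for an unattended pointer to travel from center to an end."""
    return HALF_RANGE * MS_PER_PIXEL / 1000

def reset_points(distance_px):
    """Points for a reset at distance_px from center.

    Assumption: linear scaling rounded down, capped at 10 points.
    """
    return min(10, 10 * distance_px // HALF_RANGE)

def end_penalty(seconds_at_end):
    """Ten points are deducted for each full second spent at the end of the scale."""
    return -10 * int(seconds_at_end)
```

Under these figures an unattended pointer reaches the end of the scale 20 s after a reset, so the payoff favors waiting as long as possible without letting the pointer arrive at the end.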

Lower right window, auditory monitoring. A brief tone was sounded every 5 s. The tone was
either of two frequencies, low (1046 Hz) or high (1319 Hz). The subject's task was to click the mouse in a
box at the top of the window labeled “HIGH TONE REPORT” following a high tone. The probability of
high tones was 0.2. Correct responses were those following high tones prior to the next scheduled tone. All
other  responses  were  incorrect.  Ten  points  were  awarded  for  each  correct  response  and  were  deducted  for 
each  error. 
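
The event rate and scoring rule of the auditory monitoring subtask can be worked out directly from these parameters. The following sketch is illustrative: the names are hypothetical, and treating a second report after the same high tone as an error is an assumption that the text does not address.

```python
TONE_INTERVAL_S = 5
P_HIGH = 0.2

def expected_high_tones(minutes):
    """Expected number of high tones in a period of the given length."""
    tones = minutes * 60 / TONE_INTERVAL_S
    return tones * P_HIGH

def score_response(last_tone_was_high, already_reported):
    """+10 for the first report after a high tone, -10 for any other report.

    Assumption: a second report after the same high tone counts as an error.
    """
    if last_tone_was_high and not already_reported:
        return 10
    return -10
```

With 12 tones per minute and a high-tone probability of 0.2, a subject can expect about 2.4 reportable events per minute, which makes this a low-event-rate vigilance component within the composite task.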


Results 

Because  of  the  differences  between  tasks,  no  direct  statistical  comparisons  were  made  among 
different  measures.  Rather,  each  measure  was  analyzed  separately.  Comparisons  among  measures  will 
only  be  made  with  reference  to  the  presence  or  absence  of  significant  sleep  deprivation  or  time  of  day 
effects  relative  to  the  baseline  for  that  measure.  Table  1  presents  the  results  of  repeated-measures 
ANOVAs  for  measures  of  response  speed,  accuracy,  and  performance  efficiency  for  all  of  the  PAB  tests 
except  the  tapping  test,  and  for  Synwork  subtests  from  midnight  on  Day  2  to  the  end  of  the  experiment  (the 
2100  session  on  Day  3).  Thus,  there  were  2  levels  for  day  (Days  2  and  3),  and  8  levels  for  time  of  day  (the 
8  sessions  conducted  each  day). 

In  Table  1,  the  Response  Speed  measure  used  was  the  reciprocal  of  the  average  time  required  to 
make  a  correct  response.  Response  Accuracy  was  simply  the  percent  correct  responses  on  each  task.  As  an 
estimate of Performance Efficiency for the PAB tests, throughput, defined by Thorne (1983) as correct
responses per working minute, was used. For Synwork, the composite score and the scores for each
individual task were used. This table does not show any clear differences between the PAB tests and
Synwork. It is interesting to note that performance efficiency seemed to be a more sensitive indicator of
sleep  deprivation  effects  than  either  of  the  two  measures  upon  which  it  is  based. 
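
The three PAB measures defined above can be written as simple functions. The sketch is a reading of the definitions in the text, with hypothetical function names and made-up session values; "working minute" is taken here to mean time actually spent on the test, which is an assumption.

```python
def response_speed(correct_rts_s):
    """Reciprocal of the mean time (s) required to make a correct response."""
    return len(correct_rts_s) / sum(correct_rts_s)

def percent_correct(n_correct, n_total):
    """Response accuracy as percent correct."""
    return 100.0 * n_correct / n_total

def throughput(n_correct, working_minutes):
    """Correct responses per working minute."""
    return n_correct / working_minutes

# Hypothetical session: 40 responses, 36 correct, 4 min of working time.
rts = [6.0] * 36    # every correct response took 6 s (illustrative values)
speed = response_speed(rts)          # 1/6 correct responses per second
accuracy = percent_correct(36, 40)   # 90.0
rate = throughput(36, 4.0)           # 9.0
```

Throughput falls when responses slow down and also when errors increase, so it can register a decrement that neither speed nor accuracy shows clearly on its own.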


Table 1

Analysis of Variance p-Values for Day and Time of Day (Circadian Rhythm)

                            Response Speed        Response Accuracy     Performance Efficiency
Test                        Day    Time of Day    Day    Time of Day    Day    Time of Day
Matrix                      •      •              0.030  0.026          •      •
Two-column Addition         0.069  0.160          0.006  0.332          0.001  0.017
Four-choice                 0.029  0.199          0.319  0.373          0.004  0.027
Simple RT                   0.974  0.434          *      *              0.036  0.196
Go-No-Go RT                 0.080  0.060          0.074  0.227          0.011  0.080
Logic (3)                   0.010  0.296          0.084  0.165          0.001  0.129
Logic (2,3,4)               0.195  0.337          0.056  0.135          0.604  0.370
Digit-Symbol Substitution   0.115  0.371          0.109  0.483          0.013  0.197
Word Memory                 0.292  0.386          0.002  0.134          0.385  0.093
Synwork overall rate        0.211  0.028          *      *              *      *
Synwork comp. score         *      *              *      *              0.019  0.013
Synwork memory              0.133  0.367          0.408  0.046          0.279  0.041
Synwork addition            0.077  0.003          0.013  0.145          0.039  0.015
Synwork visual mon.         0.008  0.384          *      *              0.017  0.141
Synwork auditory mon.       0.124  0.151          0.096  0.093          0.131  0.142


Performance on most tasks reached a peak at 2100 on Day 1. At this point, the tasks were
well-learned, and sleep deprivation was minimal. Beginning with the next session, sleep loss began to take its
toll, and performance deteriorated. To characterize these performance decrements, a series of t-tests was
conducted in which performance at each test time on Days 2 and 3 was compared with the 2100 session on
Day 1. Thus, significant effects represent departure from the optimal performance for that task.


Degrees of freedom: Day (1,8); Time of Day (7,56), adjusted with the Greenhouse-Geisser correction for repeated measures.
• Missing data   * Not defined

Percentage difference from baseline (see text). p < .05; p < .01

Day 2                  0000     0300     0600     0900     1200     1500     1800     2100
SYN1 composite score   104.4    103.0    84.8     88.3     106.9    102.6    107.6    92.7
SYN1 memory            114.2    125.6    80.9     109.1    103.7    126.7    153.9    132.9
SYN1 addition          103.4    84.5     58.6^    76.5^    101.8    91.3     105.7    73.3
SYN1 visual mon.       101.6    102.4    100.5    91.9     106.1    104.5    104.6    102.0
SYN1 auditory mon.     111.0^   119.8^   124.2'^  105.1    122.0'^  109.7^   98.6     86.7

Day 3                  0000     0300     0600     0900     1200     1500     1800     2100
SYN1 composite score   83.5^    84.4     59.4-^   78.3     104.4    96.7     109.7    91.9
SYN1 memory            119.7^   158.3^   75.0^    101.3    191.6    153.r    129.7    50.3
SYN1 addition          67.9     51.4     37.6     74.6     86.4     81.0     103.6    83.0
SYN1 visual mon.       94.2     89.8     67.4     78.8     96.6     105.3    107.6    99.7
SYN1 auditory mon.     102.2    100.3    73.1"    93.7     115.r    97.3     107.6-^  99.6


Table  2  shows  normalized  group  means  for  response  speed,  accuracy,  and  performance  efficiency 
for  all  of  the  tasks  except  the  tapping  task.  These  values  were  calculated  by  converting  the  scores  for  each 
subject  and  test  to  the  percentage  of  baseline  for  that  subject  and  test,  where  baseline  was  the  mean  score 
for  the  5  sessions  on  Day  1,  then  averaging  these  normalized  values  to  obtain  a  group  mean.  To  provide 
statistical  estimates  of  the  significance  of  departures  from  baseline,  t-tests  were  conducted  for  each  test 
comparing  performance  at  each  session  on  Days  2  and  3  with  the  last  session  on  Day  1.  In  almost  all  cases, 
significant  departures  from  baseline  represented  decreases.  In  a  few  cases,  however,  most  notably 
SYNWORK  visual  monitoring  response  speed,  there  were  significant  increases.  To  simplify  comparisons 
among  measures,  the  differences  are  summarized  in  Table  3,  which  shows  the  percentages  of  sessions 
deviating  from  baseline. 
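
The normalization described for Table 2 can be sketched as follows. The data are hypothetical and the function names are invented for illustration; only the procedure (percentage of each subject's Day 1 mean, then averaging across subjects) is taken from the text.

```python
from statistics import mean

def percent_of_baseline(day1_scores, later_scores):
    """Express one subject's later scores as a percentage of that
    subject's Day 1 mean for the same test."""
    baseline = mean(day1_scores)
    return [100.0 * s / baseline for s in later_scores]

def group_means(per_subject):
    """Average normalized values across subjects, session by session."""
    return [mean(session) for session in zip(*per_subject)]

# Hypothetical data for two subjects (5 baseline sessions, 3 later sessions).
subjects = [
    ([10, 10, 10, 10, 10], [10, 8, 5]),
    ([20, 20, 20, 20, 20], [22, 16, 10]),
]
normalized = [percent_of_baseline(b, later) for b, later in subjects]
means = group_means(normalized)
```

Normalizing within subject before averaging puts tests with very different raw units on a common scale, so a value of 80 means a 20% decline from that subject's own Day 1 level regardless of the test.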


Table 3

Percentage of Sessions Significantly Different From Baseline

Measure                   Battery    Day 2    Day 3
Response Speed            PAB        75.0     75.0
                          Synwork    22.5     47.5
Response Accuracy         PAB        28.1     54.7
                          Synwork     8.3     33.3
Performance Efficiency    PAB        84.4     96.9
                          Synwork    17.5     22.5


For  the  purposes  of  the  present  paper,  the  most  important  fact  to  be  seen  in  Tables  2  and  3  is  that 
Synwork  measures  showed  fewer  significant  departures  from  baseline  than  the  PAB  measures.  This  was 
true regardless of response measure or day. Second, as expected, greater effects were seen for most
measures on Day 3 than on Day 2. Third, some apparent time-of-day effects were evident, mostly on
Day 3. Finally, Table 2 suggests that accuracy measures were the least sensitive to sleep deprivation,
although performance efficiency for Synwork was also relatively insensitive.

A  number  of  figures  will  better  illustrate  some  of  the  differences  between  PAB  and  Synwork 
performance.  Figure  1  shows  normalized  group  mean  response  speed  for  Synwork  addition  and  the  PAB 
two-column  addition  tasks.  For  both  tasks,  response  speed  was  calculated  as  the  reciprocal  of  the  time 
required  to  complete  a  correct  response.  Sleep  deprivation  clearly  had  a  much  more  dramatic  effect  on  the 
PAB  task  than  on  the  Synwork  task.  Both  declined  during  the  morning  hours,  but  only  the  Synwork  task 
recovered  during  afternoon  and  evening  sessions. 


Time  Awake  (hours) 


Figure  1.  Normalized  plots  of  response  speed  (reciprocal  of  the  time  required  to 
complete  a  correct  response).  Points  are  means  of  nine  subjects.  Mean  scores  from 
Day  1  were  used  as  control  values.  Filled  points  are  significantly  (p  <  .05)  different 
from the last session on Day 1. Vertical reference lines indicate midnight.

In Figure 2, normalized accuracy scores on the Synwork addition subtask are compared with
accuracy on the two-column addition PAB task. This figure shows that accuracy on both tasks declined as
sleep deprivation increased, reaching a minimum during the early morning hours of Day 3. Unlike most
other comparisons made in this paper, there was no clear difference between the PAB measure and the
Synwork measure, although there was a tendency for the Synwork measure to recover more than the
PAB-based measure in the afternoon and evening of each day.


Time  Awake  (hours) 


Figure 2. Normalized accuracy scores of the Synwork addition task and 2-column
addition.  Points  are  means  of  nine  subjects.  Filled  points  are  significantly  (p  <  .05) 
different  from  the  last  session  on  Day  1. 


As indicated earlier, measures of performance efficiency appear to be most sensitive to sleep
deprivation effects. To illustrate this point, as well as the difference between PAB and Synwork
performance, Figure 3 compares average throughput of 8 PAB tasks with the Synwork composite score
(i.e., the sum of the 4 subtask scores). This figure clearly shows that PAB throughput decreased as sleep
deprivation increased, remaining substantially below baseline levels until the end of the experiment.
Synwork  performance,  on  the  other  hand,  decreased  in  the  morning  hours  and  returned  to  baseline  in  the 
afternoon  and  evening. 


Time  Awake  (hours) 

Figure  3.  Normalized  performance  efficiency  scores.  PAB  scores  were  averaged 
across  eight  different  PAB  tasks.  Points  are  means  of  nine  subjects. 


The measures discussed up to this point focus on task performance, although failures to respond
do contribute to response speed and performance efficiency measures. In real-world situations, failures to
respond to a situation may prove disastrous. Thus, it is appropriate to focus directly on this type of error,
that is to say “lapses.” Two quite different measures of failure to respond, lapses in the tapping task
(i.e., inter-response intervals > 5 s) and pointer-reset failures on the visual monitoring task in Synwork, are
plotted in Figure 4.


Time  Awake  (hours) 


Figure  4.  Response  omissions  (Lapses).  Pointer-reset  failures  on  the  Synwork  visual 
monitoring task and lapses (interresponse times > 5 s) on the tapping task. Points
are  means  of  9  subjects. 


Both  measures  follow  the  same  pattern,  showing  large  and  progressively  greater  increases  on  Days 
2  and  3.  However,  the  increases  were  considerably  larger  for  the  tapping  task.  As  was  true  for  the 
previous  data,  these  measures  were  affected  following  the  first  night  of  sleep  deprivation,  with  much  larger 
effects  following  the  second  night  of  lost  sleep.  Statistically,  there  were  no  significant  effects  for  Synwork 
visual monitoring lapses, but both day (F(1,8) = 27.7, p < .01) and time of day (F(3.5,28) = 5.31, p < .01) were
significant for the tapping task. Other response-omission measures for both PAB and Synwork showed
similar  effects,  with  the  PAB  tests  generally  showing  larger  effects  of  sleep  deprivation. 
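
The degrees of freedom reported for these analyses follow from the design (nine subjects, two days, eight sessions per day). The sketch below shows the arithmetic; the function name is hypothetical, and the epsilon value of 0.5 is not reported in the text but is simply the value that would reproduce the fractional degrees of freedom F(3.5,28) given for the tapping task.

```python
def rm_anova_df(levels, n_subjects, epsilon=1.0):
    """Degrees of freedom for one within-subject effect.

    epsilon is the Greenhouse-Geisser estimate (1.0 means no adjustment);
    both numerator and denominator are multiplied by it.
    """
    df_effect = (levels - 1) * epsilon
    df_error = (levels - 1) * (n_subjects - 1) * epsilon
    return df_effect, df_error

day_df = rm_anova_df(levels=2, n_subjects=9)                      # (1.0, 8.0)
tod_df = rm_anova_df(levels=8, n_subjects=9)                      # (7.0, 56.0)
tod_adjusted = rm_anova_df(levels=8, n_subjects=9, epsilon=0.5)   # (3.5, 28.0)
```

The unadjusted values match the degrees of freedom listed with Table 1, Day (1,8) and Time of Day (7,56). An effect with only two levels, such as day, is not changed by the correction.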

Discussion 


Variables  controlling  test  performance. 

The  data  presented  here  demonstrate  clear  differences  between  sleep-deprivation  effects  for  tests 
presented  in  the  traditional  sequential  performance  assessment  battery  format  and  the  concurrent  format 
provided  by  Synwork.  These  results  are  consistent  with  the  early  work  of  Wilkinson  (1964)  who  stated  in 
an  article  summarizing  a  large  number  of  sleep  deprivation  experiments: 

A  task  will  be  vulnerable  to  sleep  deprivation  (1)  as  it  is  complex,  and  (2)  as  it  is  lacking 
in  interest.  Of  the  two  factors,  that  of  incentive  may  be  the  more  influential,  such  that  a 
highly  complex  task  may  be  little  affected  by  sleep  deprivation  if  it  is  complex  in  an 
interesting  and  rewarding  way.  (p.  175) 

The  relationships  between  task  complexity,  motivation,  and  sleep  deprivation  were  also  discussed  by 
Froberg  (1985)  who  concluded  that  both  task  characteristics  and  motivational  factors  interact  to  determine 
the  impact  of  sleep  loss  on  performance.  These  observations  are  contrary  to  the  simplistic  view  that  more 
complex  tasks  are  necessarily  more  sensitive  to  disruption  by  sleep  deprivation  (or  other  stressors).  Indeed, 
the  present  data  support  the  view  that  a  dissociation  exists  between  the  complexity  of  a  task  and  its 
resistance  to  disruption  by  sleep  deprivation  (and  probably  other  stressors  as  well),  and  that  one  must  turn 
to  motivational  factors,  as  suggested  30  years  ago  by  Wilkinson,  for  an  explanation  of  the  relative 
insensitivity  of  Synwork  to  disruption  by  sleep  deprivation  and  circadian  rhythms. 

One  of  the  most  potent  motivational  variables  in  Wilkinson's  (1964)  experiments  was  immediate 
knowledge  of  results.  Regardless  of  the  complexity  of  a  task,  subjects  who  received  information  on  their 
performance  immediately  following  each  test  were  less  affected  by  sleep  loss  than  those  who  received  no 
information  about  their  performance.  In  typical  PAB  tests,  subjects  are  not  given  knowledge  of  results, 
which undoubtedly contributes to the relative fragility of performance on these tests. In Synwork, on the
other  hand,  subjects  are  fully  informed  of  their  progress  during  the  test.  From  an  operational  point  of  view, 
these  findings  suggest  that  tasks  having  relatively  immediate  consequences  might  be  expected  to  fare  better 
under  stressful  conditions  than  those  with  less  immediate  effects. 

Synwork is frequently treated by subjects as though it were a game. Indeed, it has many of the
attributes  of  a  game,  including  simultaneous  action  in  several  locations,  motion  (though  it  is  very  simple), 
the  necessity  of  developing  a  strategy  for  "playing  the  game,"  awarding  of  points  for  performance,  and 
competition (with oneself or others). Thus, Synwork provides several potential sources of reinforcement.
Points  can  serve  as  generalized  reinforcers  based  on  their  association  with  other  reinforcers  such  as  praise 
and  competition.  The  simple  opportunity  to  choose  courses  of  action  has  been  demonstrated  to  be 
reinforcing  (e.g.,  Catania,  1975).  The  relatively  complex  visual  and  auditory  environment  provided  by 
Synwork  may  also  be  a  source  of  reinforcement.  This  aspect  of  Synwork,  and  other  game-like  testing 
procedures,  may  have  practical  consequences  for  the  conduct  of  research  studies  requiring  repeated  testing 
for  extended  periods.  Subjects  are  motivated  (or  at  least  less  reluctant)  to  return  to  the  testing  situation, 
resulting  in  more  reliable  data  collection. 

One  place  where  Synwork  proved  comparable  to  a  standard  PAB  measure  was  in  addition 
accuracy,  which  degraded  in  a  highly  similar  manner  for  both  Synwork  and  the  two-column  addition  task. 
While,  based  on  these  data,  it  may  be  tempting  to  conclude  that  similar  processes  were  in  effect  in  both 
tests,  such  a  conclusion  is  premature.  It  is  interesting  to  note  that  the  addition  subtask  of  Synwork  is  one 
aspect  of  the  task  that  is  not  temporally  constrained.  Thus,  even  within  a  complex  and  generally  motivating 
task,  those  aspects  of  the  task  that  are  not  tightly  controlled  may  be  vulnerable  to  disruption. 

Circadian Rhythm (time-of-day) effects.

While  there  were  some  clear  instances  of  circadian  rhythmicity  in  the  present  study,  there  were  also 
many  measures  which  showed  little  reliable  variation  with  time  of  day  (Table  1).  Inspection  of  Table  2  and 
Figure 3 suggests that circadian effects were absent for most of the PAB tests because PAB performance
tended to be suppressed for most of Days 2 and 3. On the other hand, when Synwork performance failed
to  be  rhythmic  it  was  because  there  was  little  change  from  baseline.  Thus  a  given  performance  can  fail  to 
demonstrate  a  circadian  rhythm  in  two  ways,  illustrated  by  curves  A  and  C  in  Figure  5.  Curve  B,  showing 
strong  circadian  rhythmicity,  is  intermediate  between  these  extremes.  Elsmore  and  Conrad  (1979,  reported 
in  Elsmore  and  Hursh,  1982)  presented  some  data  suggesting  that  the  three  curves  shown  in  Figure  5  may 
lie  on  a  continuum  determined  by  the  amount  of  reinforcement  maintaining  a  performance.  In  this 
experiment,  rhesus  monkeys  earned  food  by  pressing  on  a  panel  on  the  side  of  their  cage.  During  test 
sessions, the panel could be illuminated in one of four colors, each associated with a different frequency of
food  presentation,  and  test  sessions  were  conducted  six  times  each  day,  equally  spaced  around  the  clock. 
Under these circumstances, panel pressing maintained by a high frequency of reinforcement showed little
variation with time of day, while panel pressing maintained by a low frequency of pellet presentation was
strongly rhythmic. While it is a considerable leap from a study with nonhuman primates to the present
study, it is not unreasonable to speculate that similar motivational processes may have been operating to
determine circadian rhythmicity, particularly in work rate. That is, Synwork performance generally showed
less departure from baseline than the PAB tasks (see Figures 1 and 3), which would be predicted from
subjects’ reports that they were more strongly motivated to work on Synwork.

Prediction  of  operational  performance:  The  Performance  Assessment  Hierarchy. 

The  present  data  demonstrate  that  the  structure  of  a  test  is  critical  in  determining  the  degree  to 
which the test is sensitive to sleep deprivation and, by extension, other stressors. Which, then, are the "best"
tests  to  use?  If  this  question  is  not  qualified  in  some  manner,  there  may  be  no  reasonable  answer.  Figure  6 
presents  a  schematic  representation  of  the  trade-offs  that  come  into  play  as  one  traverses  the  continuum  of 
performance  assessment  techniques  from  performance  assessment  batteries  to  field  tests.  Sensitivity  and 
generality  of  tests  are  greatest  at  the  base  of  the  triangle.  Tests  at  this  level  evaluate  behavioral  functions 
that  are  common  to  many  different  types  of  operational  performance.  As  one  approaches  the  apex  of  the 
triangle, test situations approach operational situations of interest. Interpretability or "validity" of test data
increases; however, so do the costs (money, personnel, and time) of conducting the test. "Turf" issues (e.g.,
conflicts between training and research missions) also come into play when researchers desire access to
system  simulators  or  field  exercises.  As  one  moves  from  the  controlled  environment  of  the  laboratory  into 
the  messy  operational  world,  control  of  relevant  variables  decreases,  resulting  in  a  concomitant  decrease  in 
ability  to  detect  significant  effects  of  stressors. 


Figure 5. Sleep deprivation effects on three hypothetical performances.

Figure 6. Performance assessment hierarchy and tradeoffs.


Thus,  the  question  of  what  type  of  test  is  best  in  any  given  situation  must  take  these  trade-offs  into 
account.  The  manner  in  which  a  research  question  is  framed  is  central  to  the  determination  of  what  type  of 
test  is  appropriate.  For  example,  the  question,  "How  does  sleep  deprivation  affect  performance?"  does  not 
specify  what  kind  of  performance,  and  thus  calls  for  tests  of  broad  generality,  that  is,  PAB-type  tests.  On 
the other hand, the question, "How does sleep deprivation affect performance of sonar operators?" requires
a test with structural and functional properties that more closely approximate the operational situation
faced by sonar operators. To the degree the testing situation differs from the operational setting, detection
of  a  performance  decrement  on  a  test  merely  suggests  that  a  similar  decrement  might  exist  under 
operational  conditions,  and  it  establishes  a  requirement  for  further  research  under  more  realistic  conditions. 
For  a  multitude  of  reasons,  in  many  cases  it  may  be  impossible  or  impractical  to  take  this  next  step. 

The  synthetic  work  approach  represents  an  attempt  to  begin  the  process  of  introducing  the 
structural  and  functional  characteristics  of  operational  jobs  into  laboratory  performance  testing.  Synwork 
was  not  intended  to  simulate  any  particular  operational  job,  but  rather  to  be  a  proof  of  concept.  It  is 
theorized,  and  these  data  suggest,  that  the  synthetic  work  approach  may  provide  a  “happy  medium” 
between  standard  computer-based  performance  assessment  batteries  and  part-task  or  full  simulations  and 
field  tests.  The  approach  retains  the  economy  and  convenience  of  PC-based  testing,  while  increasing  the 
likelihood that test data will be relevant to the operational world. It remains for future work to develop and
refine  the  approach,  through  the  development  of  test  systems  targeted  to  particular  operational 
environments. 